Objectives


The proposal had three original objectives: (1) extension of central bottleneck models
as the basis for computational models of sequence behavior; (2) emergent properties
in scheduling behavioral sequences; and (3) optimising performance in sequence
behavior.  The  objectives  have  broadened  to  include  reinforcement  learning  in 
sampling  spatially  distributed  probabilistic  information  sources.  Not  only  is 
variability  in  the  spatial  distribution  of  information  a  central  feature  of  many  military 
environments  (e.g.,  radar  operations),  its  study  will  also  serve  as  the  foundation  for 
generalizing results with linear scan paths characteristic of reading to fully
2-dimensional scans characteristic of knowledge-intensive tasks, such as radar
operations. 


Status  of  Effort 

The first year’s efforts were focused on three principal objectives: (1) the role of
preparation in observed data patterns, (2) learning to sample spatially disparate
information sources based on reward patterns, and (3) exploring just-in-time scheduling of
central  resources  as  an  optimality  criterion.  Experiments  examining  the  role  of 
preparation  investigated  possible  sources  for  the  elevation  of  RT1,  a  significant 
emergent  property  seen  in  multiple  response  sequences  that  is  not  observed  with 
single  discrete  responses.  Experiments  exploring  learning  and  adapting  to  the  relative 
information  value  of  separate  spatial  locations  provided  initial  insights  into  how 
behavior becomes optimised over time, and into the resource demands of that learning.
These experiments will also provide the empirical basis for optimal models of behavior
in a reinforcement-learning framework. Computational modeling addressed the issue of
optimal strategies by proposing a simple model for efficient regular saccades designed
to minimize the variance in the eye movements and achieve a "just-in-time"
scheduling of central operations. The central bottleneck has provided the cognitive
architecture  for  existing  models.  Experiments  are  currently  being  designed  to  test  the 
simple  model  and  to  relate  the  demands  imposed  by  the  task  to  the  strategy  adopted. 

Accomplishments 

Several  key  empirical  results  emerged  from  the  experiments  conducted  in  the  first 
year.  The  role  of  preparation  was  addressed  in  two  studies  examining  the  effects  of 
number of items and eye movement requirements on RT1 elevation. No differences
in RT1, IRI, or dwell time were found for either manipulation. If RT1 reflects
preparation, that preparation is not a function of sequence length or the presence of
eye movements. RT1 elevation could represent a kind of "first-trial cost" as often
observed  in  task  switching  studies.  We  are  designing  experiments  to  relate  the  two 
findings. 

Alternatively,  the  cost  could  be  related  to  initializing  a  sequence  of  actions 
irrespective of the number of items. Results of an experiment incorporating go and
no-go  stimuli  in  the  trial  sequence  suggest  this  is  one  of  perhaps  multiple  components 
to  RT1  elevation.  RT1  elevation  was  reduced  by  approximately  120  ms  when  the  first 
item  was  a  no-go  stimulus  (RT1  was  made  to  the  second  item  in  the  sequence).  This 
suggests  some  component  of  sequence  programming  or  initialisation  plus  a 
component due to response selection or retrieval. In a separate condition, we found
that RT1 was reduced by over 70 ms when subjects were instructed to respond to the
first item only and ignore the rest, compared to a condition where they responded to
the first item and were instructed to simply fixate the remaining items in turn.
Interestingly,  fixation  durations  for  target  items  did  not  differ  systematically  with 
their  position  in  the  sequence.  This  is  a  pattern  we  have  observed  now  in  several 
experiments  and  points  to  a  decoupling  of  fixations  and  manual  responses  under  at 
least some conditions. There are interactions between neighboring items that are
currently  being  explored  both  with  computational  models  to  examine  possible 
pushback  effects,  and  with  further  experiments.  We  are  designing  experiments  to  see 
how  this  pattern  is  altered  as  the  complexity  of  the  eye  fixation  pattern  is  varied  and 
information  sources  are  conditioned  on  prior  reinforcement. 

We  have  conducted  an  experiment  varying  the  difficulty  of  items  within  a  sequence. 
Here, as in reading, fixation durations are longer for the more difficult items, as is
RT1.  However,  as  in  earlier  experiments  with  blocked  difficulty  manipulations,  the 
increase  in  fixation  duration  is  less  than  the  increase  in  RT.  This  result  has  also  been 
noted  in  studies  of  eye  movements  in  reading.  Further  experiments  with  varied 
difficulty  under  time  pressure  constraints  will  further  probe  these  effects. 

Several  alternative  computational  models  of  the  range  of  empirical  findings  have  been 
developed and reported in the first year. All of them are variants of central bottleneck
postponement  models  that  differ  in  the  control  of  saccade  initiation.  Models  that 
assume  a  saccade  is  generated  following  a  fixed  stage  of  processing  tend  to  produce 
constant  eye-hand  spans.  That  is,  they  fail  to  adequately  decouple  manual  from  ocular 
responses. As an alternative we developed a "just-in-time" model, which provided good
fits  to  these  data  assuming  subjects  attempted  to  meet  two  optimization  criteria: 
minimization  of  eye  movement  variability  and  a  just-in-time  scheduling  of  central 
stages  to  eliminate  wait  states  or  data  storage.  The  minimum  variance  assumption 
captures  the  behavior  of  a  simple  automatic  movement  generator  producing  saccades 
without  interfering  with  stimulus  processing  and  without  deliberate  intent  (central 
processor  involvement).  The  just-in-time  scheduling  of  central  stages  reflects  an 
efficient  saccade  generator  where  the  period  of  saccade  generation  is  adapted  to  the 
overall  information  processing  demands.  That  is,  the  model  chooses  a  periodic 
movement  that  minimizes  the  overall  time  between  successive  central  bottleneck 
stages.  This  simple  just-in-time  model  provided  good  fits  to  the  blocked  difficulty 
data  and  was  able  to  handle  the  go/no-go  data  by  assuming  that  the  no-go  stimulus 
triggered  a  saccade  as  a  response.  Refinements  of  the  model  are  in  progress  to  account 
for  the  difficulty  data. 
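
As a rough illustration of the just-in-time idea described above, the following Python sketch simulates a central bottleneck fed by strictly periodic saccades (the minimum-variance case) and searches for the period that leaves the central stage neither idle (wait states) nor backed up (data storage). The function names, the two-stage simplification (a perceptual stage followed by a serial central stage), the stage durations, and the cost (idle time plus storage time) are all assumptions made for illustration; this is not the fitted model reported here.

```python
def schedule_cost(period, perceptual, central):
    """Return (idle, storage) in ms for a given saccade period.

    idle    -- total time the central stage waits for perceptual input
    storage -- total time perceptual output waits for the central stage
    Assumes item i is fixated at i * period (saccade duration ignored).
    """
    idle = 0.0
    storage = 0.0
    central_free = 0.0  # time at which the central stage becomes available
    for i, (a, b) in enumerate(zip(perceptual, central)):
        ready = i * period + a           # perceptual output available
        start = max(ready, central_free)  # central stage is serial
        if i > 0:                         # no idle cost before the first item
            idle += max(0.0, ready - central_free)
        storage += max(0.0, central_free - ready)
        central_free = start + b
    return idle, storage


def best_period(perceptual, central, candidates):
    """Pick the candidate period with the smallest idle + storage time."""
    def cost(period):
        idle, storage = schedule_cost(period, perceptual, central)
        return idle + storage
    return min(candidates, key=cost)


# Hypothetical durations: 100 ms perceptual stage, 300 ms central stage.
perceptual = [100] * 6
central = [300] * 6
print(best_period(perceptual, central, range(100, 601, 50)))  # prints 300
```

With constant stage durations the selected period simply equals the central-stage duration. Under this sketch, harder items (longer central stages) would lengthen the chosen period, which is one way a periodic generator could adapt to overall processing demands.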

The  two  modeling  approaches  adopted  mirror  the  controversy  in  the  reading  and 
visual search literature over "process control" and "global estimation" models.
Process  control  models  assume  that  saccades  are  triggered  by  the  completion  of  some 
processing  stage.  Global  estimation  models  assume  that  the  system  adjusts  the  period 
over  learning  to  keep  fixation  times  approximately  equal.  It  is  clear  that  some 
knowledge  of  the  state  of  internal  processing  is  required  to  account  for  the  increased 
duration  of  fixations  on  individual  items.  The  issue  is  what  information  is  used  and 
how  it  is  integrated  with  putative  mechanisms  that  schedule  periodic  saccades.  We  are 
exploring extensions of existing ideal observer models of eye fixation location (e.g.,
Mr. Chips; Legge et al., 1997) to include the timing of saccades. In addition, we are
exploring reinforcement learning models to account for the fixation patterns in the
probability learning experiments, which have shown convergence of eye fixation on
locations  in  accordance  with  their  information  value. 
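
The kind of reinforcement learning account mentioned above can be sketched in a few lines of Python. In this minimal example a simulated observer chooses which location to fixate, receives a probabilistic reward standing in for useful information, and updates a value estimate for that location with a simple delta rule. The reward probabilities, learning rate, epsilon-greedy choice rule, and function name are hypothetical and chosen only for illustration; they are not taken from the experiments described here.

```python
import random


def simulate_fixation_learning(reward_prob, n_trials=5000, alpha=0.1,
                               epsilon=0.1, seed=0):
    """Delta-rule learning of location values with epsilon-greedy fixations.

    reward_prob -- assumed probability that each location yields information
    Returns (values, counts): learned value and fixation count per location.
    """
    rng = random.Random(seed)
    n = len(reward_prob)
    values = [0.0] * n
    counts = [0] * n
    for _ in range(n_trials):
        if rng.random() < epsilon:
            loc = rng.randrange(n)  # occasional exploratory fixation
        else:
            loc = max(range(n), key=values.__getitem__)  # fixate best so far
        reward = 1.0 if rng.random() < reward_prob[loc] else 0.0
        values[loc] += alpha * (reward - values[loc])  # delta-rule update
        counts[loc] += 1
    return values, counts


values, counts = simulate_fixation_learning([0.2, 0.5, 0.8])
print(counts)  # fixations per location; most should fall on the last one
```

In this sketch fixations concentrate on the location with the highest assumed information value. Whether observers maximize in this way or distribute fixations in proportion to value is an empirical question that the sketch does not settle.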

In  the  Pashler  lab  several  studies  have  been  conducted  to  examine  the  effects  of 
reward  gradients  on  spatial  response  selection.  This  has  been  studied  using  mouse 
movements in several different tasks involving different kinds of reward gradients. In
some,  there  is  an  optimal  x,y  position  which  the  subject  attempts  to  locate  using 
sequential  mouse  clicks  (seeing  a  reward  after  each  response).  In  others,  there  are  two 
independent  dimensions  (time  and  space)  being  controlled,  with  individual  reward 
values associated with each. One question concerned whether subjects could
simultaneously  update  the  temporal  and  spatial  dimensions  of  their  responses  based 
on  reward  signals.  To  our  surprise,  it  seems  that  this  is  possible.  In  another 
experiment,  we  are  assessing  reinforcement  learning  strategies,  and  hope  to  model  the 
results using a temporal difference (TD) learning algorithm. In recent work we have
also extended the x,y task to oculomotor responses. So far it seems that while
subjects respond more quickly, the same basic strategies may be emerging. We hope
to write up some of our initial results using the 2-dimensional adjustment task quite
soon. 
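
To make the structure of the x,y task concrete, the Python sketch below defines a hypothetical reward gradient with a single optimum and a simulated subject who makes sequential clicks, keeping a new position only if its reward is at least as high as the best seen so far. The gradient shape, the target location, the step size, and this keep-if-not-worse rule are all assumptions for illustration. The rule is a simple baseline strategy, not the TD model mentioned above and not a description of what subjects actually did.

```python
import math
import random


def reward(x, y, target=(0.7, 0.3)):
    """Hypothetical gradient: 1.0 at the target, falling linearly with distance."""
    return max(0.0, 1.0 - math.hypot(x - target[0], y - target[1]))


def search(n_clicks=200, step=0.1, seed=0):
    """Sequential clicks in the unit square with a keep-if-not-worse rule.

    Returns the final position and the best reward after each click.
    """
    rng = random.Random(seed)
    x, y = 0.1, 0.9  # arbitrary starting position
    best = reward(x, y)
    history = [best]
    for _ in range(n_clicks):
        # Propose a nearby click, clipped to the unit square.
        nx = min(1.0, max(0.0, x + rng.uniform(-step, step)))
        ny = min(1.0, max(0.0, y + rng.uniform(-step, step)))
        r = reward(nx, ny)
        if r >= best:  # keep the new position only if reward did not drop
            x, y, best = nx, ny, r
        history.append(best)
    return (x, y), history


position, history = search()
print(position, history[-1])
```

A strategy of this kind gives a reference point against which learned strategies, such as those produced by a TD model, could be compared.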